RELPRON: A Relative Clause Evaluation Data Set for Compositional Distributional Semantics
نویسندگان
چکیده
This article introduces RELPRON, a large data set of subject and object relative clauses, for the evaluation of methods in compositional distributional semantics. RELPRON targets an intermediate level of grammatical complexity between content-word pairs and full sentences. The task involves matching terms, such as “wisdom,” with representative properties, such as “quality that experience teaches.” A unique feature of RELPRON is that it is built from attested properties, but without the need for them to appear in relative clause format in the source corpus. The article also presents some initial experiments on RELPRON, using a variety of composition methods including simple baselines, arithmetic operators on vectors, and finally, more complex methods in which argument-taking words are represented as tensors. The latter methods are based on the Categorial framework, which is described in detail. The results show that vector addition is difficult to beat—in line with the existing literature—but that an implementation of the Categorial framework based on the Practical Lexical Function model is able to match the performance of vector addition. The article finishes with an in-depth analysis of RELPRON, showing how results vary across subject and object relative clauses, across different head nouns, and how the methods perform on the subtasks necessary for capturing relative clause
منابع مشابه
Introducing Structure into Neural Network-Based Semantic Models
In this talk I will describe two attempts at introducing syntactic structure into semantic models using neural network architectures. The first study focuses on a particular grammatical construction, namely relative clauses, and centers around the design of a new dataset for testing compositional distributional models. The dataset is called RELPRON, and consists of pairs of terms and properties...
متن کاملCompositional-ly Derived Representations of Morphologically Complex Words in Distributional Semantics
Speakers of a language can construct an unlimited number of new words through morphological derivation. This is a major cause of data sparseness for corpus-based approaches to lexical semantics, such as distributional semantic models of word meaning. We adapt compositional methods originally developed for phrases to the task of deriving the distributional meaning of morphologically complex word...
متن کاملEstimating Linear Models for Compositional Distributional Semantics
In distributional semantics studies, there is a growing attention in compositionally determining the distributional meaning of word sequences. Yet, compositional distributional models depend on a large set of parameters that have not been explored. In this paper we propose a novel approach to estimate parameters for a class of compositional distributional models: the additive models. Our approa...
متن کاملSICK through the SemEval glasses. Lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment
This paper is an extended description of SemEval-2014 Task 1, the task on the evaluation of Compositional Distributional Semantics Models on full sentences. Systems participating in the task were presented with pairs of sentences and were evaluated on their ability to predict human judgments on (i) semantic relatedness and (ii) entailment. Training and testing data were subsets of the SICK (Sen...
متن کاملCollaborative Training of Tensors for Compositional Distributional Semantics
Type-based compositional distributional semantic models present an interesting line of research into functional representations of linguistic meaning. One of the drawbacks of such models, however, is the lack of training data required to train each word-type combination. In this paper we address this by introducing training methods that share parameters between similar words. We show that these...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Computational Linguistics
دوره 42 شماره
صفحات -
تاریخ انتشار 2016